Reverse Engineering Intel Last-Level Cache Complex Addressing Using Performance Counters
Abstract
Cache attacks, which exploit timing differences to build covert or side channels, are now well understood. Recent works leverage the last-level cache to perform cache attacks across cores. This cache is split into slices, with one slice per core. While predicting the slice used by an address is simple in older processors, recent processors use an undocumented technique called complex addressing. This renders some attacks more difficult and makes other attacks impossible, because of the loss of precision in the prediction of cache collisions. In this paper, we build an automatic and generic method for reverse engineering Intel's last-level cache complex addressing, consequently rendering this class of cache attacks highly practical. Our method relies on CPU hardware performance counters to determine the cache slice an address is mapped to. We show that our method gives a more precise description of the complex addressing function than previous work. We validated our method by reversing the complex addressing functions on a diverse set of Intel processors. This set encompasses the Sandy Bridge, Ivy Bridge and Haswell micro-architectures, with different numbers of cores, for mobile and server ranges of processors. We show the correctness of our function by building a covert channel. Finally, we discuss how other attacks, such as sandboxed rowhammer, benefit from knowing the complex addressing of a cache.
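To make the idea of a slice-mapping function concrete, here is a minimal Python sketch. It rests on assumptions that do not come from the abstract: the function is modelled as an XOR (parity) of selected address bits, one mask per output bit; the masks in `HYPOTHETICAL_MASKS` are invented for illustration and are not the function of any real processor; and `slice_of` stands in for the measurement step, which the paper performs with hardware performance counters. Under the XOR assumption, flipping one address bit at a time and observing which output bits change is enough to recover the masks.

```python
ADDRESS_BITS = 32  # assumed width of the addresses explored in this sketch


def parity(x: int) -> int:
    """Return the XOR of all bits of x (0 or 1)."""
    return bin(x).count("1") & 1


def mask_from_bits(bits):
    """Build an integer mask with the given bit positions set."""
    mask = 0
    for b in bits:
        mask |= 1 << b
    return mask


# Hypothetical masks, invented for illustration only (NOT a real Intel function).
# Output bit i of the slice index is the parity of (address & mask i).
HYPOTHETICAL_MASKS = [
    mask_from_bits([6, 10, 12, 17]),
    mask_from_bits([7, 11, 13, 18]),
]


def slice_of(addr: int, masks=HYPOTHETICAL_MASKS) -> int:
    """Simulated oracle: slice index of an address under the assumed XOR model."""
    index = 0
    for i, mask in enumerate(masks):
        index |= parity(addr & mask) << i
    return index


def recover_masks(oracle, n_output_bits: int, address_bits: int = ADDRESS_BITS):
    """Recover the XOR masks by flipping one address bit at a time.

    Only valid if the oracle is linear over XOR, which is an assumption here.
    """
    base = oracle(0)
    masks = [0] * n_output_bits
    for b in range(address_bits):
        changed = oracle(1 << b) ^ base  # output bits affected by address bit b
        for i in range(n_output_bits):
            if (changed >> i) & 1:
                masks[i] |= 1 << b
    return masks


recovered = recover_masks(slice_of, n_output_bits=2)
print([hex(m) for m in recovered])
```

On real hardware the oracle would be replaced by a measurement of which slice serviced an access, and a function that is not a pure XOR of address bits would need a different recovery strategy.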
Related Resources
Mapping the Intel Last-Level Cache
Modern Intel processors use an undisclosed hash function to map memory lines into last-level cache slices. In this work we develop a technique for reverse-engineering the hash function. We apply the technique to a 6-core Intel processor and demonstrate that knowledge of this hash function can facilitate cache-based side channel attacks, reducing the amount of work required for profiling the cac...
Performance Analysis of Speech Recognition Software
This paper characterizes the behavior of a speaker-independent large vocabulary continuous speech recognition (LVCSR) system. This system is used to dictate Chinese (Mandarin) utterances of different speakers and achieves a word recognition accuracy of 85% to 96%, depending on the cleanness of the input signals and the complexity of the spoken sentences. Several methods are used to characterize its p...
Demystifying Intel Branch Predictors
Improvement of branch predictors has been one of the focal points of computer architecture research during the last decade, ranging from two-level predictors to complex hybrid mechanisms. Most research efforts try to use real, already implemented, branch predictor sizes and organizations for comparison and evaluation. Yet, little is known about exact predictor implementation in Intel processors...
A Cache Architecture for Counting Bloom Filters: Theory and Application
Within packet processing systems, lengthy memory accesses greatly reduce performance. To overcome this limitation, network processors utilize many different techniques, for example, utilizing multilevel memory hierarchies, special hardware architectures, and hardware threading. In this paper, we introduce a multilevel memory architecture for counting Bloom filters. Based on the probabilities of...
Performance Evaluation of the Intel Sandy Bridge Based NASA Pleiades Using Scientific and Engineering Applications
We present a performance evaluation of Pleiades based on the Intel Xeon E5-2670 processor, a fourth-generation eight-core Sandy Bridge architecture, and compare it with the previous third generation Nehalem architecture. Several architectural features have been incorporated in Sandy Bridge: (a) four memory channels as opposed to three in Nehalem; (b) memory speed increased from 1333 MHz to 1600...